Skip to content

RAG: 401 on invalid OpenAI key; prevent silent empty results (#371/#362)#521

Closed
TimothiousAI wants to merge 2 commits intomainfrom
fix/371-auth-bubble-rag
Closed

RAG: 401 on invalid OpenAI key; prevent silent empty results (#371/#362)#521
TimothiousAI wants to merge 2 commits intomainfrom
fix/371-auth-bubble-rag

Conversation

@TimothiousAI
Copy link
Copy Markdown
Collaborator

@TimothiousAI TimothiousAI commented Aug 29, 2025

Summary

Surface OpenAI auth failures as HTTP 401 instead of silent empty results in RAG. Fix single-vector embedding call and guard reranking to avoid undefined variables.

Fixes: #362
Related: #371

What & Why

Users with invalid/expired OpenAI keys saw [] results and no guidance.

This change bubbles EmbeddingAuthenticationError → API returns 401 with "Invalid API key", so users get clear, actionable feedback.

Changes

Server

embedding_exceptions.py: add EmbeddingAuthenticationError.

embedding_service.py:

map openai.AuthenticationError → EmbeddingAuthenticationError.

in create_embedding(), re-raise EmbeddingAuthenticationError so it isn’t wrapped.

rag_service.py:

use single-vector create_embedding(text=...).

guard reranking with self.reranking_strategy is not None.

re-raise EmbeddingAuthenticationError to API.

api_routes/knowledge_api.py:

translate EmbeddingAuthenticationError to HTTP 401 for /api/rag/query (and keep code-examples behavior unchanged).

No schema/config changes.

How I Tested

Unit/functional (inside Docker archon-server):

Targeted RAG tests now pass after fixes:

tests/test_rag_simple.py::TestRAGIntegrationSimple::test_code_search_with_agentic_rag ✅

tests/test_rag_strategies.py::{test_full_rag_pipeline,test_error_handling_in_rag_pipeline} ✅

Quota behavior suite (selected) ✅

Note: tests/mcp_server/... are for the MCP image and are not run in the server container (fail to import mcp in this image). Unchanged by this PR.

Manual API checks (with bogus key):

POST /api/rag/query → 401 {"detail":"Invalid API key"} ✅

POST /api/rag/code-examples → unchanged behavior (not embedding-gated) ✅

UI sanity: with invalid key, crawls proceed until embedding phase; RAG search returns 401 instead of [].

Screenshots / Logs

curl to /api/rag/query → 401 (Invalid API key)

create_embedding("hello") raises EmbeddingAuthenticationError (expected).

Breaking Changes

None. Only error-mapping & guard logic.

Risks & Rollback

Risk: If any upstream code expected silent empty results, it will now see 401.

Rollback: Revert this PR; behavior returns to prior state.

Checklist

Branch on coleam00/Archon (fix/371-auth-bubble-rag)

Clear title & linked issues (#362, #371)

Server tests pass for changed areas (see notes on MCP tests)

No secrets committed; logs sanitize API keys

Matches CLAUDE.md coding rules and existing patterns

Docs/README not required (server-side error semantics only)

Verified in Docker (docker compose up --build -d)

Reviewer Notes

Key touchpoints:

python/src/server/services/embeddings/embedding_exceptions.py

python/src/server/services/embeddings/embedding_service.py

python/src/server/services/search/rag_service.py

python/src/server/api_routes/knowledge_api.py

Focus on the re-raise path for EmbeddingAuthenticationError and the single-vector embedding call.

Summary by CodeRabbit

  • Bug Fixes

    • Invalid API keys now consistently return HTTP 401 with “Invalid API key” and stop further processing to avoid partial results.
    • Improved handling of embedding failures to prevent inconsistent or partial outputs.
  • Refactor

    • Reranking now runs only when an explicit reranking strategy is configured, improving predictability.
    • Error propagation across search and knowledge flows unified so authentication issues surface reliably to the API.
  • Other

    • Responses now indicate whether reranking was applied and include improved debug logging.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Aug 29, 2025

Walkthrough

Adds EmbeddingAuthenticationError and propagates it through embedding_service, rag_service, and knowledge_api to return HTTP 401 on authentication failures. Adjusts embedding calls to use a single-vector API parameter (text=). Simplifies reranking gating to depend solely on presence of a reranking_strategy and tracks reranking_applied.

Changes

Cohort / File(s) Summary of Changes
API layer auth handling
python/src/server/api_routes/knowledge_api.py
Import and handle EmbeddingAuthenticationError; log and raise HTTPException(status_code=401, detail={"error":"Invalid API key"}) from perform_rag_query and search_code_examples.
Embedding exceptions
python/src/server/services/embeddings/embedding_exceptions.py
Add EmbeddingAuthenticationError with api_key_prefix masking and metadata. Update EmbeddingAPIError.__init__ to accept original_error and record its type/message in metadata.
Embedding service auth propagation
python/src/server/services/embeddings/embedding_service.py
Map openai.AuthenticationErrorEmbeddingAuthenticationError; re-raise auth errors so they bubble out of create_embedding and create_embeddings_batch. Add inner/outer try/except to stop batch processing on auth failures.
RAG service flow updates
python/src/server/services/search/rag_service.py
Use create_embedding(text=query) single-vector call. Add explicit EmbeddingAuthenticationError handling in search_documents, perform_rag_query, and search_code_examples_service to propagate 401. Gate reranking by reranking_strategy is not None; add reranking_applied flag and debug log after successful reranking.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant API as knowledge_api
  participant RAG as rag_service
  participant EMB as embedding_service
  participant PROVIDER as Embedding Provider

  Client->>API: perform_rag_query / search_code_examples
  API->>RAG: search(query)
  RAG->>EMB: create_embedding(text=query)
  EMB->>PROVIDER: embed request
  alt Authentication failure
    PROVIDER-->>EMB: AuthenticationError
    EMB-->>RAG: EmbeddingAuthenticationError
    RAG-->>API: EmbeddingAuthenticationError
    API-->>Client: 401 {"error":"Invalid API key"}
  else Success
    PROVIDER-->>EMB: embedding vector
    EMB-->>RAG: vector
    RAG->>RAG: optionally rerank if reranking_strategy is not None
    RAG-->>API: results (includes reranking_applied)
    API-->>Client: 200 results
  end
Loading
sequenceDiagram
  autonumber
  participant RAG as rag_service
  participant Strat as reranking_strategy (optional)

  RAG->>RAG: Check if reranking_strategy is not None
  alt strategy is None
    RAG-->>RAG: Skip reranking (reranking_applied=false)
  else strategy provided
    RAG->>Strat: Rerank results
    Strat-->>RAG: Reranked results
    RAG-->>RAG: Set reranking_applied=true
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Assessment against linked issues

Objective (issue #) Addressed Explanation
Log and surface OpenAI API key errors when key lacks credits / cause failed RAG queries (#362) PR surfaces authentication errors as 401 and logs them; the issue pertains to key-credit/quota errors (e.g., 429 or insufficient-credits) which may not map to AuthenticationError—unclear if 429/quota scenarios are captured.

Poem

"I nibble logs and hop on keys,
If auth fails, I thump with ease.
Vectors stitched from single thread,
Rerank only when strategy's fed.
A rabbit cheers: 401 — carrots please! 🥕"

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/371-auth-bubble-rag

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbit in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbit in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbit gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbit read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbit help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbit ignore or @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbit summary or @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbit or @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@TimothiousAI
Copy link
Copy Markdown
Collaborator Author

Fixes #362

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
python/src/server/services/search/rag_service.py (1)

342-349: Fix: ‘reranking_applied’ may be reported True even when reranking fails.

Track success with a flag, not just strategy presence, and expose it in response and span.

-                # Apply reranking if we have a strategy
-                if self.reranking_strategy is not None and results:
-                    try:
-                        results = await self.reranking_strategy.rerank_results(
-                            query, results, content_key="content"
-                        )
-                    except Exception as e:
-                        logger.warning(f"Code reranking failed: {e}")
+                # Apply reranking if we have a strategy
+                reranking_applied = False
+                if self.reranking_strategy is not None and results:
+                    try:
+                        results = await self.reranking_strategy.rerank_results(
+                            query, results, content_key="content"
+                        )
+                        reranking_applied = True
+                    except Exception as e:
+                        logger.warning(f"Code reranking failed: {e}")
@@
                 response_data = {
                     "query": query,
                     "source_filter": source_id,
                     "search_mode": "hybrid" if use_hybrid_search else "vector",
-                    "reranking_applied": self.reranking_strategy is not None,
+                    "reranking_applied": reranking_applied,
                     "results": formatted_results,
                     "count": len(formatted_results),
                 }
@@
                 span.set_attribute("results_found", len(formatted_results))
                 span.set_attribute("hybrid_used", use_hybrid_search)
-                span.set_attribute("reranking_used", self.reranking_strategy is not None)
+                span.set_attribute("reranking_used", self.reranking_strategy is not None)
+                span.set_attribute("reranking_applied", reranking_applied)

Also applies to: 367-378

🧹 Nitpick comments (3)
python/src/server/api_routes/knowledge_api.py (1)

34-37: Remove unused import.

EmbeddingQuotaExhaustedError isn’t used in this module. Clean it up to satisfy Ruff (F401) and guidelines.

-from ..services.embeddings.embedding_exceptions import (
-    EmbeddingAuthenticationError,
-    EmbeddingQuotaExhaustedError,
-)
+from ..services.embeddings.embedding_exceptions import EmbeddingAuthenticationError
python/src/server/services/embeddings/embedding_service.py (2)

233-237: Map upstream auth failure to EmbeddingAuthenticationError and include masked key context.

Great mapping. For diagnosability, pass a masked prefix (never the full key).

-                                except openai.AuthenticationError as e:
+                                except openai.AuthenticationError as e:
                                     # Invalid API key - critical error, stop everything
                                     search_logger.error("Authentication failed: Invalid API key", exc_info=True)
-                                    raise EmbeddingAuthenticationError("Invalid API key") from e
+                                    from os import getenv
+                                    masked = (getenv("OPENAI_API_KEY", "")[:4] + "…") if getenv("OPENAI_API_KEY") else None
+                                    raise EmbeddingAuthenticationError("Invalid API key", api_key_prefix=masked) from e

72-87: Docstring: include EmbeddingAuthenticationError in Raises.

Reflect the new behavior for callers and API layer.

 async def create_embedding(text: str, provider: str | None = None) -> list[float]:
@@
-    Raises:
+    Raises:
+        EmbeddingAuthenticationError: When upstream API key is invalid/unauthorized
         EmbeddingQuotaExhaustedError: When OpenAI quota is exhausted
         EmbeddingRateLimitError: When rate limited
         EmbeddingAPIError: For other API errors
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bb57459 and c3cf1ee.

📒 Files selected for processing (4)
  • python/src/server/api_routes/knowledge_api.py (3 hunks)
  • python/src/server/services/embeddings/embedding_exceptions.py (1 hunks)
  • python/src/server/services/embeddings/embedding_service.py (5 hunks)
  • python/src/server/services/search/rag_service.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
python/src/{server,mcp,agents}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/{server,mcp,agents}/**/*.py: Fail fast on service startup failures, missing configuration, database connection issues, auth failures, critical dependency outages, and invalid data that would corrupt state
External API calls should use retry with exponential backoff and ultimately fail with a clear, contextual error message
Error messages must include context (operation being attempted) and relevant IDs/URLs/data for debugging
Preserve full stack traces in logs (e.g., Python logging with exc_info=True)
Use specific exception types; avoid catching broad Exception unless re-raising with context
Never signal failure by returning None/null; raise a descriptive exception instead

Files:

  • python/src/server/api_routes/knowledge_api.py
  • python/src/server/services/embeddings/embedding_exceptions.py
  • python/src/server/services/embeddings/embedding_service.py
  • python/src/server/services/search/rag_service.py
python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: For batch processing and background tasks, continue processing but log detailed per-item failures and return both successes and failures
Do not crash the server on a single WebSocket event failure; log the error and continue serving other clients

Files:

  • python/src/server/api_routes/knowledge_api.py
  • python/src/server/services/embeddings/embedding_exceptions.py
  • python/src/server/services/embeddings/embedding_service.py
  • python/src/server/services/search/rag_service.py
python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting and Mypy for type checking before commit

Files:

  • python/src/server/api_routes/knowledge_api.py
  • python/src/server/services/embeddings/embedding_exceptions.py
  • python/src/server/services/embeddings/embedding_service.py
  • python/src/server/services/search/rag_service.py
{python/**/*.py,archon-ui-main/src/**/*.{ts,tsx,js,jsx}}

📄 CodeRabbit inference engine (CLAUDE.md)

{python/**/*.py,archon-ui-main/src/**/*.{ts,tsx,js,jsx}}: Remove dead code immediately; do not keep legacy/unused functions
Avoid comments that reference change history (e.g., LEGACY, CHANGED, REMOVED); keep comments focused on current functionality

Files:

  • python/src/server/api_routes/knowledge_api.py
  • python/src/server/services/embeddings/embedding_exceptions.py
  • python/src/server/services/embeddings/embedding_service.py
  • python/src/server/services/search/rag_service.py
python/src/server/**

📄 CodeRabbit inference engine (CLAUDE.md)

Keep FastAPI application code under python/src/server/ (routes in api_routes/, services in services/, main in main.py)

Files:

  • python/src/server/api_routes/knowledge_api.py
  • python/src/server/services/embeddings/embedding_exceptions.py
  • python/src/server/services/embeddings/embedding_service.py
  • python/src/server/services/search/rag_service.py
python/src/{server/services,agents}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never accept or store corrupted data (e.g., zero embeddings, null foreign keys, malformed JSON); skip failed items entirely instead of persisting bad data

Files:

  • python/src/server/services/embeddings/embedding_exceptions.py
  • python/src/server/services/embeddings/embedding_service.py
  • python/src/server/services/search/rag_service.py
🧬 Code graph analysis (3)
python/src/server/api_routes/knowledge_api.py (2)
python/src/server/services/embeddings/embedding_exceptions.py (2)
  • EmbeddingAuthenticationError (86-98)
  • EmbeddingQuotaExhaustedError (46-58)
python/src/server/config/logfire_config.py (1)
  • safe_logfire_error (238-250)
python/src/server/services/embeddings/embedding_service.py (1)
python/src/server/services/embeddings/embedding_exceptions.py (1)
  • EmbeddingAuthenticationError (86-98)
python/src/server/services/search/rag_service.py (2)
python/src/server/services/embeddings/embedding_exceptions.py (1)
  • EmbeddingAuthenticationError (86-98)
python/src/server/services/embeddings/embedding_service.py (1)
  • create_embedding (72-132)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Backend Tests (Python + pytest)
🔇 Additional comments (7)
python/src/server/services/embeddings/embedding_service.py (3)

111-113: Good: let auth errors bubble to HTTP 401.


280-283: Good: immediate bubble-up of auth errors during batch loop.


333-336: Good: top-level re-raise preserves 401 semantics.

python/src/server/services/search/rag_service.py (4)

116-118: Good: use single-vector create_embedding(text=...).


144-146: Good: propagate auth failures to avoid silent empty results.


236-243: Reranking gating looks good.

Proceeding only when a strategy exists and results are present is correct.


381-383: Good: bubble auth failures to API layer from code examples path.

Comment thread python/src/server/api_routes/knowledge_api.py
Comment on lines +794 to 797
except EmbeddingAuthenticationError as e:
safe_logfire_error(f"Authentication error in code examples search: {str(e)}")
raise HTTPException(status_code=401, detail="Invalid API key")
except Exception as e:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Same here: standardize the 401 response body.

-        raise HTTPException(status_code=401, detail="Invalid API key")
+        raise HTTPException(status_code=401, detail={"error": "Invalid API key"})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except EmbeddingAuthenticationError as e:
safe_logfire_error(f"Authentication error in code examples search: {str(e)}")
raise HTTPException(status_code=401, detail="Invalid API key")
except Exception as e:
except EmbeddingAuthenticationError as e:
safe_logfire_error(f"Authentication error in code examples search: {str(e)}")
raise HTTPException(status_code=401, detail={"error": "Invalid API key"})
except Exception as e:
🤖 Prompt for AI Agents
In python/src/server/api_routes/knowledge_api.py around lines 794 to 797, the
except block raises HTTPException(status_code=401, detail="Invalid API key")
which returns an ad-hoc string; replace this with the project's standardized 401
response body (either by returning/importing the shared unauthorized response
object/function used elsewhere, e.g. API_RESPONSES.UNAUTHORIZED or a
get_401_response() helper) so the 401 payload matches the rest of the API;
update the raise to use that shared constant/function and import it at the top
of the file.

Comment thread python/src/server/services/embeddings/embedding_exceptions.py
… prefix; accurate reranking_applied; remove unused import
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
python/src/server/services/embeddings/embedding_exceptions.py (1)

94-99: Prefer neutral naming for logged key context

Rename the metadata key to avoid implying an unmasked prefix. Keeps semantics clear in logs.

Apply:

-        if masked:
-            self.metadata["api_key_prefix"] = masked
+        if masked:
+            self.metadata["api_key_hint"] = masked
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c3cf1ee and 2c3b8ef.

📒 Files selected for processing (3)
  • python/src/server/api_routes/knowledge_api.py (3 hunks)
  • python/src/server/services/embeddings/embedding_exceptions.py (1 hunks)
  • python/src/server/services/search/rag_service.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • python/src/server/api_routes/knowledge_api.py
  • python/src/server/services/search/rag_service.py
🧰 Additional context used
📓 Path-based instructions (6)
python/src/{server,mcp,agents}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/{server,mcp,agents}/**/*.py: Fail fast on service startup failures, missing configuration, database connection issues, auth failures, critical dependency outages, and invalid data that would corrupt state
External API calls should use retry with exponential backoff and ultimately fail with a clear, contextual error message
Error messages must include context (operation being attempted) and relevant IDs/URLs/data for debugging
Preserve full stack traces in logs (e.g., Python logging with exc_info=True)
Use specific exception types; avoid catching broad Exception unless re-raising with context
Never signal failure by returning None/null; raise a descriptive exception instead

Files:

  • python/src/server/services/embeddings/embedding_exceptions.py
python/src/{server/services,agents}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never accept or store corrupted data (e.g., zero embeddings, null foreign keys, malformed JSON); skip failed items entirely instead of persisting bad data

Files:

  • python/src/server/services/embeddings/embedding_exceptions.py
python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: For batch processing and background tasks, continue processing but log detailed per-item failures and return both successes and failures
Do not crash the server on a single WebSocket event failure; log the error and continue serving other clients

Files:

  • python/src/server/services/embeddings/embedding_exceptions.py
python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting and Mypy for type checking before commit

Files:

  • python/src/server/services/embeddings/embedding_exceptions.py
{python/**/*.py,archon-ui-main/src/**/*.{ts,tsx,js,jsx}}

📄 CodeRabbit inference engine (CLAUDE.md)

{python/**/*.py,archon-ui-main/src/**/*.{ts,tsx,js,jsx}}: Remove dead code immediately; do not keep legacy/unused functions
Avoid comments that reference change history (e.g., LEGACY, CHANGED, REMOVED); keep comments focused on current functionality

Files:

  • python/src/server/services/embeddings/embedding_exceptions.py
python/src/server/**

📄 CodeRabbit inference engine (CLAUDE.md)

Keep FastAPI application code under python/src/server/ (routes in api_routes/, services in services/, main in main.py)

Files:

  • python/src/server/services/embeddings/embedding_exceptions.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Backend Tests (Python + pytest)
🔇 Additional comments (2)
python/src/server/services/embeddings/embedding_exceptions.py (2)

86-101: Auth exception + masked key: LGTM

Specific exception type and masked key handling look correct and align with the PR goals.


86-101: End-to-end 401 handling and embedding call verification complete: EmbeddingAuthenticationError is re-thrown in embedding_service, mapped to HTTP 401 in both RAG query and code examples routes, single-vector create_embedding(text=…) is in place, and reranking is gated by reranking_strategy.

TimothiousAI pushed a commit that referenced this pull request Aug 29, 2025
Integrates critical fixes from PR #521 into Thomas's comprehensive error handling infrastructure:

- RAG Service: Add EmbeddingAuthenticationError import and propagation
- Embedding Service: Fix single embedding function to properly convert openai.AuthenticationError → EmbeddingAuthenticationError
- Embedding Service: Improve error type detection in batch failure handling
- RAG Service: Let embedding errors bubble up instead of converting to generic errors

Now OpenAI authentication errors properly return HTTP 401 instead of HTTP 500:
- Before: {"error": "RAG query failed: ..."}  [HTTP 500]
- After: {"error": "OpenAI API authentication failed", "error_type": "authentication_failed"}  [HTTP 401]

Tested with both invalid and valid API keys. All endpoints working correctly.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@leex279
Copy link
Copy Markdown
Collaborator

leex279 commented Sep 6, 2025

Included in #371

@leex279 leex279 closed this Sep 6, 2025
@Wirasm Wirasm deleted the fix/371-auth-bubble-rag branch April 6, 2026 07:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

🐛 [Bug]: Missing Error handling on Open AI API key Error 429

2 participants